Abstract

We study the efficiency of a neural-net filter and deconvolution method for estimating 
jet energies and spectra in high-background reactions such as nuclear collisions at the 
relativistic heavy-ion collider and the large hadron collider. The optimal network is 
shown to be surprisingly close but not identical to a linear high-pass filter. A suitably 
constrained deconvolution method is shown to uncover accurately the underlying jet 
distribution in spite of the broad network response. Finally, we show that possible 
changes of the jet spectrum in nuclear collisions can be analyzed quantitatively, in 
terms of an effective energy loss with the proposed method. 
1 Introduction



Jet analysis has been proposed as one of the tools to probe the dense matter produced
in high-energy AA reactions because of its sensitivity to the energy-loss mechanisms
and infrared correlation scales [1, 2]. However, identifying jets and estimating their
total energy in AA reactions poses a practical challenge because of the large
background of low-transverse-energy hadrons produced along with the rare jets.
Conventional methods of jet analysis developed for pp collisions [3, 4] begin to fail
in pA collisions [5] due to the enhanced nuclear background and can be expected to fail
completely for future applications to nuclear collisions at the relativistic heavy-ion
collider (RHIC) and the large hadron collider (LHC) [6]. The question addressed in
this paper is whether the powerful pattern-recognition techniques recently developed
in the field of artificial neural networks [7] could help overcome this problem. We
show below that neurocomputing techniques do in fact look promising for the present
application.

In particular, we study the efficiency of feed-forward networks (FFN) for application
to jet analysis. We show that a high-pass linear neural filter can be trained (using
Monte Carlo event generators [2] or ideally pp data) to provide a nearly bias-free
estimator of the jet energy distribution even in the presence of a very high level of
low-transverse-momentum "noise". In addition, we show that knowledge of the neural
response function allows us to deconvolute the filtered jet distribution and recover the
underlying "primordial" jet distribution to a surprisingly high degree of accuracy.
Finally, in the case of most physical interest, where the jet-fragmentation function
becomes significantly modified by the dense nuclear medium, the method proposed
leads to a quantitative estimate of the average energy loss.

To put this problem into perspective, we recall that perturbative quantum
chromodynamics (PQCD) predicts that in collisions of high-energy hadrons or nuclei,
occasional high-momentum-transfer parton scattering processes lead to a calculable
primordial distribution, $I(E, \eta, \phi)$, of quarks and gluons with transverse energy
$E > 2$ GeV, pseudorapidity $\eta = -\log\tan(\theta/2)$, and azimuthal angle $\phi$.
Those partons fragment into a jet of secondary hadrons with highly correlated momenta
which we denote by $(e_a, \eta_a, \phi_a)$. Here $e_a$ is the transverse energy, $\eta_a$
the pseudorapidity, and $\phi_a$ the azimuthal angle of hadron $a$ fragmenting from the
jet parton. The problem of jet analysis is to identify only those hadrons out of the
total multiplicity which are fragments from the jet and reject hadrons from background
processes due to a variety of other dynamical mechanisms (pedestal effect, beam jets,
multiple mini-jets). The objective then is to reconstruct the kinematics of the primary
jets and the primordial distribution $I(E, \eta, \phi)$.
Conventional methods for jet identification utilize the fact that most jet fragments
are collimated into an angular cone [3] 



Therefore, the jet energy, as determined for example by a segmented calorimeter, is 
approximately given by 



where $\theta(x)$ is the Heaviside step function. However, this is a biased estimator of
the initial parton energy $E$ because the background processes contribute to the yield
of hadrons with $e_a < E_c \sim 2$ GeV/c in the jet cone. Also, the jet hadronization
mechanism can produce hadrons outside the angular cone $R$. Therefore, the measured
output distribution $O(E_T)$ can be expected to differ significantly from the primordial
input distribution $I(E)$. This distortion of the primordial spectrum of course becomes
more severe as the low-frequency (i.e., low $e_a$) noise increases. For reactions such as
$e^+e^-$ and pp the background noise is limited to a few particles per unit pseudorapidity.
In this case $E_T$ is in fact an excellent estimator for $E_T > 10$ GeV. However, in Au+Au
collisions [2, 6] at RHIC energies, for example, the nonperturbative background is at
least 400 times greater than in pp, and estimates with event simulators [2] indicate
that the signal-to-noise ratio in (2) is on the order of unity for jets in the energy
range $10 < E < 40$ GeV.
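As a rough illustration of a cone estimator and of what a transverse-energy cut does to it, the sketch below sums hadron energies inside a cone in $(\eta, \phi)$. The distance measure $\sqrt{\Delta\eta^2 + \Delta\phi^2}$, the radius `R = 1.0`, the function name `cone_energy`, and the toy event are all assumptions made for illustration; they are not taken from the text.

```python
import numpy as np

def cone_energy(e, eta, phi, jet_eta, jet_phi, R=1.0, e_cut=0.0):
    """Sum transverse energies of hadrons inside a cone of radius R.

    e_cut = 0 gives the plain cone sum; e_cut > 0 additionally drops
    soft hadrons (a simple high-pass filter).
    """
    e, eta, phi = map(np.asarray, (e, eta, phi))
    dphi = np.abs(phi - jet_phi)
    dphi = np.minimum(dphi, 2.0 * np.pi - dphi)   # wrap the azimuthal difference
    r = np.hypot(eta - jet_eta, dphi)
    inside = (r < R) & (e > e_cut)
    return float(e[inside].sum())

# Toy event (hypothetical numbers): two hard fragments near (0, 0) plus soft hadrons.
e   = np.array([12.0, 6.0, 1.5, 0.4, 0.3, 0.5])    # GeV
eta = np.array([0.05, -0.1, 0.3, 0.2, -0.4, 1.4])
phi = np.array([0.02, 0.1, -0.2, 0.5, 6.2, 3.0])

plain    = cone_energy(e, eta, phi, 0.0, 0.0)             # all hadrons in the cone
filtered = cone_energy(e, eta, phi, 0.0, 0.0, e_cut=2.0)  # only hadrons above 2 GeV
```

In this toy event the cut removes the soft hadrons but also a 1.5 GeV hadron close to the jet axis, which is the kind of information loss the filter has to compensate for.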

Figure 1 shows a typical Au+Au event with two 30 GeV jets at RHIC as predicted
with the Monte Carlo event generator HIJING [2]. Plotted are the transverse energies
$e_a$ of all produced hadrons with $e_a > E_c$, with $E_c = 0.2$ and 2 GeV/c respectively,
as a function of their azimuthal angle $\phi_a$. It is obvious from Figure 1 that most of
the background particles have low $e_a$ and can be filtered out by setting
$E_c \approx 2$ to 3 GeV/c. Therefore, instead of adding the energies of all particles
within a jet angular cone as in Eq. (2), it will pay to first filter out the
low-frequency noise. This is only possible with a detector such as a time-projection
chamber (TPC) since the momenta of all charged particles can be determined
simultaneously. Detection of neutral particles requires in addition a highly segmented
neutral energy calorimeter in conjunction with a TPC.

While the simple filter above discards most of the background particles, it of
course also discards valid jet fragments with $e_a < E_c$. This leads to an inevitable
loss of information that would bias the estimator (2) downward. The aim of this work is
to develop a more robust estimator of the jet energy that can adaptively compensate for
the loss of information caused by filtering out the low-frequency noise. Our starting
point, borrowed from the field of neurocomputation, is that FFN provide a powerful
adaptive tool for approximating arbitrary $\mathbb{R}^n \to \mathbb{R}^m$ mappings [8].

200 A GeV Au + Au -> 30 GeV jets + X ($|\eta| < 1.5$)
Figure 1: HIJING Monte Carlo [2] simulation of a $\sqrt{s} = 200$ A GeV central
$^{197}$Au + $^{197}$Au collision producing two jets with $E = 30$ GeV together with the
associated soft and multi-mini-jet background. The pulse heights represent the
transverse energy $E$ of individual particles as a function of their azimuthal angle
$\phi$ for $|\eta| < 1.5$. In the upper graph, all particles produced with $E > 0.2$ GeV
are plotted. In the lower graph, only those that survive a high-pass filter with
$E > 2$ GeV are plotted.

An $N$-layer FFN maps an input data array $X = (x_1, \dots, x_n)$ into an output array
$S = (s_1, \dots, s_m)$ via

$$S = F(W_N \cdots F(W_2 F(W_1 X)) \cdots) . \tag{3}$$

The rectangular $n_i \times n_{i-1}$ connection matrices $W_i$ together with response
function(s) $F(Y) = (f_1(y_1), \dots, f_k(y_k))$ define the mapping. The $f_i$ are
typically parameterized in terms of sigmoid-type functions, but linear functions are
sometimes sufficient for the task. The number of layers (connectivity matrices) and the
block structure and dimensionality of the connectivity matrices define the architecture
of the network. FFN are especially useful because they can be "taught", in principle, an
arbitrarily complex mapping through a variety of simple learning algorithms [7]. They
are of practical interest because they can, in principle, also be implemented in
hardware via fast, parallel, analog VLSI technology [9]. This last feature of FFN is of
special interest for high-energy and nuclear physics due to the growing need for faster
triggering and rapid information processing to cope with the ever increasing rate and
volume of data produced by modern detectors. The adaptivity and speed of FFN have been
emphasized recently in several other applications to high-energy-physics problems [10,
11, 12, 13].
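The layered mapping of Eq. (3) can be sketched in a few lines. The layer sizes, the random weights, and the names `ffn_forward` and `sigmoid` below are illustrative assumptions only; the sketch just shows the repeated "matrix multiply, then response function" structure and that a purely linear response collapses to a single matrix product.

```python
import numpy as np

def sigmoid(y):
    """A common sigmoid-type response function."""
    return 1.0 / (1.0 + np.exp(-y))

def ffn_forward(x, weights, response=sigmoid):
    """Apply S = F(W_N ... F(W_2 F(W_1 X))) for a list of matrices W_1..W_N."""
    s = np.asarray(x, dtype=float)
    for w in weights:
        s = response(w @ s)
    return s

rng = np.random.default_rng(0)
# Hypothetical two-layer network mapping R^3 -> R^4 -> R^2.
layers = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]
x = np.array([0.5, -1.0, 2.0])

s_sigmoid = ffn_forward(x, layers)                          # nonlinear response
s_linear = ffn_forward(x, layers, response=lambda y: y)     # linear response
```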

2 Neural Network Jet Filters 

We concentrate in this paper on a specific aspect of this problem, namely whether the
information loss due to filtering the data can be efficiently compensated for using an
FFN. In principle, the input to the network is the array of transverse energies within
an angular cone R. The momenta and energies of produced particles are presumed to 
be determined by a first stage tracking algorithm (see a recent discussion of adaptive 
tracking methods in Ref. [14]). In our numerical simulations, however, we restrict 
the study to a distribution of isolated quark jets as our aim here is to illustrate the 
power of the method rather than deal with all the complications of nuclear reactions 
at once. 

2.1 Network Architecture 

We consider a network architecture as illustrated in Figure 2. The first layer of
our FFN is just a simple threshold high-pass filter which only passes the transverse
energies of particles with $e_a > E_c$. The output of this first layer is then sorted
with transverse energies in decreasing order. This is the only nonlinear operation that
we consider here. The sort is performed to allow the subsequent layer to utilize possible
correlations among leading hadrons. We denote the sorted vector of filtered transverse
energies by

$$\mathbf{e}^k = \{e_0^k, e_1^k, \dots, e_k^k \mid e_0^k = 1 \text{ and } e_1^k > e_2^k > \dots > e_k^k > E_c\} . \tag{4}$$

We refer to $e_j^k$ as the transverse energy of the $j$th rank hadron in an event where
$k$ hadrons pass the filter. The first rank hadron is the one with the largest energy in
the jet cone, and so on. The zeroth component $e_0^k = 1$ GeV is added for later
notational convenience. Note that $R$ and $E_c$ are parameters of the network.

In the next layer, we introduce a linear "neuron" for every $k$ with a connection
weight vector $\mathbf{w}^k = \{w_0^k, w_1^k, \dots, w_k^k\}$. Neuron $k$ only responds if
$k$ hadrons pass the filter threshold, and its output is used as the estimator of the
jet energy,

$$E' = \mathbf{w}^k \cdot \mathbf{e}^k = \sum_{i=0}^{k} w_i^k e_i^k . \tag{5}$$

Note that since $e_0^k = 1$ GeV, the component $w_0^k$ acts as an external bias which has
the physical interpretation of the missing energy in GeV caused by the high-pass filter.
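The threshold, sort, and weighted-sum layers described above can be sketched as follows. The function names and the weight values are hypothetical (they are not the trained weights of Table 1); the only ingredients taken from the text are the cut $E_c = 2$ GeV, the decreasing sort, the constant zeroth component, and the per-$k$ dot product of Eq. (5).

```python
import numpy as np

def sorted_filtered_vector(e, e_cut=2.0):
    """Return e^k = (1, e_1 > e_2 > ... > e_k > e_cut) and the multiplicity k."""
    e = np.asarray(e, dtype=float)
    passed = np.sort(e[e > e_cut])[::-1]          # high-pass filter, then sort
    return np.concatenate(([1.0], passed)), passed.size

def estimate_jet_energy(e, weights, e_cut=2.0):
    """E' = w^k . e^k using the weight vector of the neuron for the observed k.

    Returns None when no neuron is available for this k.
    """
    ek, k = sorted_filtered_vector(e, e_cut)
    if k not in weights:
        return None
    return float(np.dot(weights[k], ek))

# Hypothetical weights for illustration only, NOT the values of Table 1.
weights = {2: np.array([4.0, 1.0, 1.0]),
           3: np.array([3.0, 1.0, 1.0, 0.9])}

hadrons = [0.7, 9.0, 1.2, 5.0, 3.0, 0.3]          # transverse energies in GeV
ek, k = sorted_filtered_vector(hadrons)
e_prime = estimate_jet_energy(hadrons, weights)
```

Here three hadrons survive the cut, so the $k = 3$ neuron responds; its bias component stands in for the 2.2 GeV removed by the filter.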



Structure of the Feed Forward Network (layers: threshold, sort, sum)

Figure 2: Illustration of the neural filter network. The first layer filters out
particles with energy $e_a < E_c$. The second layer sorts the remaining transverse
energies into the vector $\mathbf{e}^k = \{e_0^k, e_1^k, \dots, e_k^k\}$ with $e_0^k = 1$
and $e_1^k > e_2^k > \dots > E_c$. The third layer estimates the jet energy via
$E' = \mathbf{w}^k \cdot \mathbf{e}^k$ using weights $w_i^k$ trained on sample data.

The problem then is to determine the weights, given the threshold $E_c$ and jet cone
$R$, such that $E'$ becomes an unbiased estimator of the jet energy. In principle, $E_c$
and $R$ should also be considered as variational parameters to optimize the performance
of the net. However, these are fixed in our analysis for numerical simplicity.

2.2 Network Parameters 

Suppose that $\mathcal{P}_k(\mathbf{e}^k, E)$ is the probability that a jet of known
energy $E$ fragments into $k$ hadrons above threshold with energies $\mathbf{e}^k$. The
performance of neuron $k$ for estimating the jet energy can be measured via an error
function:

$$\chi_k^2(E) = \frac{1}{2}\int (E - \mathbf{w}^k\cdot\mathbf{e}^k)^2\, \mathcal{P}_k(\mathbf{e}^k, E)\, de_1^k \cdots de_k^k
= \frac{1}{2}\Big(\sum_{ij} w_i^k C_{ij}^k(E)\, w_j^k - 2E\sum_i w_i^k A_i^k(E) + E^2 P_k(E)\Big) , \tag{6}$$

where $A_i^k(E) = \langle e_i^k \rangle$ is the mean energy of the $i$th rank hadron
produced from a jet of energy $E$ when only the leading $k$ particles pass the filter,
$C_{ij}^k(E) = \langle e_i^k e_j^k \rangle$ is the covariance of the $i$th and $j$th rank
hadrons, and $P_k(E) = \int \mathcal{P}_k(\mathbf{e}^k, E)\, de_1^k \cdots de_k^k$ is the
probability that only the first $k$ rank hadrons survive the high-pass filter cut. Note
that $P_k$, $A_i^k$, and $C_{ij}^k$ are determined by the jet-fragmentation function,
$\mathcal{P}_k(\mathbf{e}^k, E)$, which depends implicitly also on $E_c$ and $R$.

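Averaging over the primordial PQCD spectrum $I(E)$ of jets, a global error function
for neuron $k$ can be constructed as

$$\langle\chi_k^2\rangle = \int \chi_k^2(E)\, I(E)\, dE = \frac{1}{2}\Big(\sum_{ij} w_i^k T_{ij}^k w_j^k - 2\sum_i w_i^k F_i^k + Q^k\Big) , \tag{7}$$

with $T_{ij}^k = \int C_{ij}^k(E) I(E)\, dE$, $F_i^k = \int E\, A_i^k(E) I(E)\, dE$, and
$Q^k = \int E^2 P_k(E) I(E)\, dE$, as follows from inserting (6). In contrast to $P_k$,
$A_i^k$, and $C_{ij}^k$, the $Q^k$, $F_i^k$, and $T_{ij}^k$ depend on the form of the
QCD jet spectrum $I(E)$.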

We determine the neural weights $\mathbf{w}^k$ so as to minimize the global error
function. Since $\langle\chi_k^2\rangle$ is a positive definite quadratic form, it has
one global minimum, and therefore the simplest learning dynamics can be used to train
the network. That minimum can be easily found via the gradient descent equations

$$\frac{dw_i^k}{dt} = -\frac{\partial\langle\chi_k^2\rangle}{\partial w_i^k} = -\sum_j T_{ij}^k w_j^k + F_i^k , \tag{8}$$

or by simply solving the linear equation $T\mathbf{w} = F$ numerically.
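A minimal sketch of this training step on toy data is given below. The fragmentation model (`toy_jet`, a Dirichlet split among eight hadrons) and the exponential energy spectrum are stand-ins chosen only so the example runs; they are not the JETSET fragmentation or the PQCD spectrum used in the paper. The moments $T$ and $F$ are estimated as sample averages, and the weights are obtained both by solving $T\mathbf{w} = F$ and by integrating the gradient descent equation with a fixed step.

```python
import numpy as np

rng = np.random.default_rng(1)
E_CUT = 2.0

def toy_jet(E):
    """Hypothetical fragmentation: split E among 8 hadrons (Dirichlet)."""
    return E * rng.dirichlet(np.full(8, 0.5))

# Training sample: jet energies drawn from a falling toy spectrum.
energies = 10.0 + rng.exponential(10.0, size=20000)
samples = {}                                   # k -> list of (e^k, E)
for E in energies:
    e = toy_jet(E)
    passed = np.sort(e[e > E_CUT])[::-1]
    samples.setdefault(passed.size, []).append((np.concatenate(([1.0], passed)), E))

def train_neuron(pairs):
    """Solve T w = F with T_ij = <e_i e_j> and F_i = <E e_i> over the sample."""
    ek = np.array([p[0] for p in pairs])
    E = np.array([p[1] for p in pairs])
    T = ek.T @ ek / len(E)
    F = ek.T @ E / len(E)
    return np.linalg.solve(T, F), T, F

def gradient_descent(T, F, steps=2000):
    """Integrate dw/dt = -T w + F with a step set by the largest eigenvalue of T."""
    w = np.zeros_like(F)
    dt = 1.0 / np.linalg.eigvalsh(T).max()
    for _ in range(steps):
        w += dt * (F - T @ w)
    return w

w3, T3, F3 = train_neuron(samples[3])          # neuron for k = 3 surviving hadrons
w3_gd = gradient_descent(T3, F3)               # approaches w3 as steps increase
```

The direct solve is exact up to round-off, while plain gradient descent can converge slowly when $T$ is poorly conditioned, which is one practical reason to prefer the linear solve when it is available.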

To test the network, the jet spectrum $I(E)$ was calculated via lowest order PQCD
as in [1]. The integration over the fragmentation function was performed via Monte
Carlo assuming all jets were back-to-back $\eta = 0$ quark-antiquark pairs for
simplicity. The two-jet-fragmentation scheme of LUND JETSET6.3 [15] was used to generate
the hadronic fragments. The transverse-energy threshold was fixed to be $E_c = 2$ GeV.
We emphasize that this is not meant to be a realistic simulation of nuclear collisions
but only a simple model to illustrate the adaptive performance of FFN in this type
of application. Table 1 lists the weights which were found to minimize the network
global error on the above training data.

The most striking result is that the weights $w_i^k$ for $i \ge 1$ turned out to be
close to 1. This is largely due to the sum rule for fragmentation, $\sum_a e_a = E$,
which requires that the weights approach unity as the threshold $E_c \to 0$. For $E_c$
small compared to the typical jet energies, one can show that the deviation of the
optimal weights from unity is in fact controlled by the correlation between the
energies of the leading and filtered hadrons via

$$w_i^k - 1 \approx \left.\frac{\langle e_i^k e_< \rangle - \langle e_i^k \rangle\langle e_< \rangle}{\langle (e_i^k)^2 \rangle - \langle e_i^k \rangle^2}\right|_E , \tag{9}$$

where $e_< = \sum_a e_a\, \theta(E_c - e_a)$ is the energy lost by the filter. Since by
definition $e_0^k = 1$ GeV, the optimal value of $w_0^k$ is close to the average missing
energy in GeV units. The missing energy depends on $k$: the optimal weights of the
leading rank 1 and 2 hadrons are generally slightly less than unity, so more missing
energy must be made up by $w_0^k$.

Table 1: Optimal weights found by minimizing the network global error, as
discussed in the text, are listed. Each row gives the bias $w_0^k$ followed by the
weights $w_1^k, \dots, w_k^k$.

  w_0    w_1    w_2    w_3    w_4    w_5    w_6    w_7    w_8
  ...    0.99
  3.74   0.98   0.99   0.97   0.97   0.96   0.92
  4.43   0.96   0.97   0.92   0.94   1.08   0.92   0.85
  5.82   0.94   0.85   1.04   0.98   0.87   1.03   0.82   0.64




2.3 Network Response 

The response of the network of course has a finite range. Let $R(E', E)$ be the
probability that the response is $E'$ to an input jet of energy $E$. This response
distribution is

$$R(E', E) = \sum_k \int \delta(\mathbf{w}^k \cdot \mathbf{e}^k - E')\, \mathcal{P}_k(\mathbf{e}^k, E)\, de_1^k \cdots de_k^k . \tag{10}$$

The response using the optimal weights discussed above is shown for $E = 10, 20, 30, 40$
GeV jets in Figure 3. The bias of the network,

$$\delta(E) = \int (E' - E)\, R(E', E)\, dE' , \tag{11}$$

measures the average shift of the estimated jet energy. The dispersion,

$$\sigma(E) = \Big(\int (E' - E)^2 R(E', E)\, dE'\Big)^{1/2} , \tag{12}$$

measures the rms fluctuation around the average response. To see that the optimal
weights lead to an unbiased estimator of the total energy, note that

$$\delta(E) = \sum_k \int (\mathbf{w}^k \cdot \mathbf{e}^k - E)\, \mathcal{P}_k(\mathbf{e}^k, E)\, de_1^k \cdots de_k^k
= \sum_k \Big(\sum_{i=0}^{k} w_i^k A_i^k(E) - E\, P_k(E)\Big) . \tag{13}$$
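The global bias is thus

$$\langle\delta\rangle = \int \delta(E)\, I(E)\, dE = \sum_k \Big(\sum_{i=0}^{k} w_i^k \int A_i^k(E) I(E)\, dE - \int E\, P_k(E) I(E)\, dE\Big) . \tag{14}$$

For the optimal weights, the $i = 0$ component of (8) gives

$$\frac{dw_0^k}{dt} = -\sum_{i=0}^{k} T_{0i}^k w_i^k + F_0^k = 0 . \tag{15}$$

Because $e_0^k = 1$, $T_{0i}^k = \int A_i^k(E) I(E)\, dE$ and
$F_0^k = \int E\, P_k(E) I(E)\, dE$, so the above equation implies that each term of the
sum over $k$ in (14) vanishes. Consequently, $\langle\delta\rangle = 0$, i.e., the optimal
network weights guarantee that the bias averaged over the spectrum vanishes.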
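The role of the constant component $e_0^k = 1$ in removing the global bias can be checked numerically on toy data. Everything below (the energy spectrum, the uniformly distributed energy removed by the filter, and the two-hadron split) is a hypothetical model chosen for illustration; the point is only that solving $T\mathbf{w} = F$ with a constant input forces the sample-averaged residual to zero, whereas a plain sum of the surviving hadrons is biased low by the mean lost energy.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
E = 10.0 + rng.exponential(10.0, size=n)      # toy jet energies (GeV)
lost = rng.uniform(1.0, 5.0, size=n)          # toy energy removed by the filter
frac = rng.uniform(0.5, 0.8, size=n)          # share carried by the leading hadron
e1 = frac * (E - lost)
e2 = (1.0 - frac) * (E - lost)
ek = np.column_stack([np.ones(n), e1, e2])    # e_0 = 1 is the bias input

T = ek.T @ ek / n
F = ek.T @ E / n
w = np.linalg.solve(T, F)                     # optimal weights for this toy sample

response = ek @ w                             # E' for every toy jet
global_bias = np.mean(response - E)           # sample estimate of <delta>
unit_weight_bias = np.mean(e1 + e2 - E)       # plain sum of survivors, no w_0
```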

Network Response
Figure 3: The response distributions for initial jet energies of 10, 20, 30, and 40 GeV
are shown separately. The total response probability, i.e., the fraction of
events that fragment with at least two hadrons with $e_a > E_c = 2.0$ GeV, is
0.53, 0.94, 0.97, and 0.98, with mean 9.48, 18.7, 28.9, and 38.9 GeV and rms width
1.43, 2.27, 2.26, and 2.23 GeV for the four cases respectively. The curves are
normalized relative to the input PQCD spectrum $I(E)$ (solid). Also shown is the
integrated output response spectrum $O(E')$ (dotted). In the simulation, the bin
size is 1 GeV.

The optimal weights also minimize the dispersion. Substituting (10) into (12),

$$\sigma^2(E) = \sum_k \int (\mathbf{w}^k \cdot \mathbf{e}^k - E)^2\, \mathcal{P}_k(\mathbf{e}^k, E)\, de_1^k \cdots de_k^k = 2\sum_k \chi_k^2(E) . \tag{16}$$

The global square dispersion is then given by

$$\langle\sigma^2\rangle = \int \sigma^2(E)\, I(E)\, dE = 2\sum_k \langle\chi_k^2\rangle . \tag{17}$$

Since the optimal weights minimize all $\langle\chi_k^2\rangle$, the global
$\langle\sigma^2\rangle$ is also minimized.

The output spectrum $O(E')$ of the network is a convolution of the response
distribution $R(E', E)$ with the primordial input spectrum $I(E)$:

$$O(E') = \int R(E', E)\, I(E)\, dE . \tag{18}$$

Binning the input and output spectra into a histogram, we can express this
convolution in matrix form as

$$O_k = \sum_i R_{ki} I_i . \tag{19}$$

Because the response distribution of the linear neuron has a finite dispersion, each
point in the input spectrum (corresponding to jets of a given energy) will spread to
nearby bins according to the response distribution of the neuron at that point. This
leads to an inevitable deformation of the input spectrum as seen in Figure 3. Note that
the network is designed to respond only to jets with at least two leading hadrons
passing through the filter. Therefore, the integrated output spectrum is also less
than the integrated input one. In the next section we discuss a method to correct
this systematic distortion of the primordial spectrum.
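The binned convolution of Eq. (19) is just a matrix-vector product. The sketch below builds a hypothetical Gaussian response matrix (the width, the downward shift, and the 95% response probability are made-up values, and the power-law input is a toy spectrum rather than the PQCD one) and shows that the integrated output falls short of the integrated input by exactly the response probability.

```python
import numpy as np

edges = np.arange(4.0, 45.0, 1.0)                 # 1 GeV bins, as in the simulation
centers = 0.5 * (edges[:-1] + edges[1:])

def gaussian_response(centers, width=2.0, shift=-1.0, efficiency=0.95):
    """Hypothetical response: R[k, i] = P(output in bin k | input in bin i)."""
    d = centers[:, None] - (centers[None, :] + shift)
    R = np.exp(-0.5 * (d / width) ** 2)
    R /= R.sum(axis=0, keepdims=True)             # each column sums to one ...
    return efficiency * R                         # ... then to the response probability

R = gaussian_response(centers)
I = 1e6 * (centers / 4.0) ** -5.0                 # toy falling spectrum, not PQCD
O = R @ I                                         # O_k = sum_i R_ki I_i
```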

3 Deconvolution 

Having established the parameters of the network, we turn next to the method of
deconvolution for jet distribution analysis. The physics goal is to recover the
primordial distribution from the distorted measured one. Naively, we would try to
invert (19) by $I = R^{-1}O$. However, in general $R$ is not symmetric and has
eigenvectors with zero eigenvalue that are not orthogonal to the others. Therefore, its
inverse is ill-defined.

3.1 The Objective Function 

The best we can do is to determine $I$ such as to maximize the likelihood that $O$ is
observed given knowledge of the response $R$. Assuming high statistics such that the
central limit theorem applies in each bin, the best fit is obtained by minimizing an
objective function such as the $\chi^2$

$$\chi^2 = \frac{1}{2}\sum_k (O_k - N_k)^2 / \sigma_k^2 , \tag{20}$$

where $N_k = \sum_i R_{ki} I_i$ is the expected number of counts in bin $k$ and
$\sigma_k^2 = N_k$ is the expected variance of the number of counts in that bin. In the
limit $N_k \gg 1$, required for the applicability of (20), a good estimate for the
variance is obtained by approximating $\sigma_k^2 \approx O_k \gg 1$. Minimizing (20)
with respect to $I$, we find that $I$ must satisfy the following linear equation:
$TI = F$, where

$$T_{ij} = \sum_k R_{kj} R_{ki} / \sigma_k^2 \approx \sum_k R_{kj} R_{ki} / O_k , \tag{21}$$

and

$$F_i = \sum_k R_{ki} O_k / \sigma_k^2 \approx \sum_k R_{ki} . \tag{22}$$

The error made in the above approximation on the right hand side decreases with
increasing $O_k$.

3.2 Singular Value Decomposition 

What has been gained relative to (19) is that $T$ is symmetric and thus has a complete
set of real orthonormal eigenvectors. Unfortunately, there is no guarantee that all
eigenvalues are non-vanishing, and in many practical cases in fact $\det T = 0$. Hence,
$T^{-1}$ still does not exist in general. However, we can define its pseudo-inverse [16],
$\tilde{T}^{-1}$, such that $\tilde{T}^{-1} T = 1 - P_0$, where $P_0$ is the projector
onto the subspace of zero eigenmodes. In that case we can "solve" for $I$ as

$$I = \tilde{T}^{-1} F + I_0 , \tag{23}$$

where $I_0 = P_0 I$ is an arbitrary vector in the zero subspace. Since $I_0$ does not
alter the value of $\chi^2$, however, we can discard it for convenience and approximate
the optimal input spectrum by

$$I_i = \sum_{jk} \tilde{T}^{-1}_{ij} R_{kj} O_k / \sigma_k^2 \approx \sum_{jk} \tilde{T}^{-1}_{ij} R_{kj} . \tag{24}$$

Note that if $\det T \ne 0$, (24) does reduce to $I = R^{-1}O$ as expected. Numerically,
$\tilde{T}^{-1}$ is obtained by the standard singular value decomposition method [16] in
which the inverse of near-zero eigenvalues is set to zero. We emphasize that the above
deconvolution procedure is not an on-line process but is to be performed once at the
end of the experiment.

Propagation of the error during deconvolution is inevitable. Given (24), the
deconvolution error is found to be

$$\sigma_{I_i}^2 = \sum_k \Big(\sum_j \tilde{T}^{-1}_{ij} R_{kj} / \sigma_k^2\Big)^2 \sigma_k^2 \approx \sum_k \Big(\sum_j \tilde{T}^{-1}_{ij} R_{kj}\Big)^2 / O_k . \tag{25}$$

This error increases as the jet energy increases because the number of counts decreases
rapidly with energy. At some point this error exceeds the systematic error before the
deconvolution. Beyond that point deconvolution is pointless and we have to live with
the small distortions due to the network response.
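The pseudo-inverse deconvolution and its error propagation can be sketched as below, using the approximation $\sigma_k^2 \approx O_k$ throughout. The function name `svd_deconvolve`, the relative cutoff `rcond`, and the toy response and spectrum are assumptions for illustration. The toy case is deliberately noise-free and well conditioned, so no mode is actually dropped and the input is recovered; with noisy counts or a broader response the truncation becomes active.

```python
import numpy as np

def svd_deconvolve(R, O, rcond=1e-10):
    """Solve T I = F (Eqs. 21-24) with a truncated-SVD pseudo-inverse.

    Uses sigma_k^2 ~ O_k, so every bin of O must be non-zero.
    Returns the estimated input spectrum and its propagated error (Eq. 25).
    """
    R = np.asarray(R, dtype=float)
    O = np.asarray(O, dtype=float)
    T = R.T @ (R / O[:, None])            # T_ij = sum_k R_ki R_kj / O_k
    F = R.sum(axis=0)                     # F_i  = sum_k R_ki O_k / O_k
    U, s, Vt = np.linalg.svd(T)
    inv_s = np.zeros_like(s)
    keep = s > rcond * s[0]               # drop near-zero modes
    inv_s[keep] = 1.0 / s[keep]
    T_inv = (Vt.T * inv_s) @ U.T          # pseudo-inverse of T
    I_hat = T_inv @ F
    M = T_inv @ R.T                       # M_ik = sum_j Tinv_ij R_kj
    err = np.sqrt((M ** 2 / O[None, :]).sum(axis=1))
    return I_hat, err

# Toy setup (hypothetical): narrow Gaussian response, gently falling spectrum.
centers = np.arange(4.5, 44.0, 1.0)
d = centers[:, None] - centers[None, :]
R = np.exp(-0.5 * d ** 2)
R = 0.95 * R / R.sum(axis=0, keepdims=True)
I_true = 1e4 * (centers / 4.0) ** -2.0
O = R @ I_true                            # noise-free toy "measurement"
I_hat, err = svd_deconvolve(R, O)
```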
Shown in Figure 4 is the optimal neural filtered jet distribution (dotted) compared
to the input QCD distribution (solid line). We see that below 20 GeV, the neural filter 
significantly underestimates the QCD distribution, but that the distortions become 
small above that energy. The normalization of the QCD counts is adjusted to that 
expected at RHIC after a year of running. The filter noise is assumed to be the 
square root of the number of counts. The square symbols indicate the result of 
deconvoluting the filter response. We see that for E < 20 GeV, the deconvolution 
method accurately corrects for the distortions caused by the neural filter. Above that 
energy the deconvolution method begins to fail as error propagation overcomes the 
accuracy of the method. 



Deconvolution via Singular Value Decomposition
Figure 4: Comparison of the input QCD jet distribution (solid) to the convoluted
network response distribution (dotted) and the deconvoluted network response
(boxes) based on the singular value method. Note that the errors (long-dashed)
propagating through the deconvolution begin to exceed the systematic bias of
the network response (long dotted) beyond $E \gtrsim 20$ GeV.



3.3 Constrained Optimization Method 

The deconvolution points in Figure 4 obtained using the singular value decomposition
method obviously have a spurious large oscillating component. This is because the
optimization procedure has over-fit the noise introduced into the response curve by
the finite number of counts in each bin. To overcome this problem, we note that there
is extra a priori knowledge about the jet spectrum which has not been used yet: the
QCD spectrum always has a positive curvature (second derivative). To utilize that
information we add a penalty term (cost function) to the $\chi^2$ of the form

$$C = \alpha \sum_i \exp(-I_{i-1} + 2I_i - I_{i+1}) . \tag{26}$$

Instead of Eq. (20), we then minimize

$$\epsilon_{err} = \chi^2 + C . \tag{27}$$

The $C$ term acts to penalize negative curvature and thus smooths out the
deconvolution. The error of the resulting solution $I$ can then be estimated by the
covariance matrix $H^{-1}$, where $H$ is the Hessian

$$H_{ij} = \frac{\partial^2 \epsilon_{err}}{\partial I_i\, \partial I_j} , \tag{28}$$

evaluated at $I$. Note that, unlike the matrix $T$, the matrix $H$ is invertible here.



Deconvolution via Constrained Optimization
Figure 5: Comparison of the input QCD jet distribution (solid) to the convoluted
network response distribution (dotted) and the final deconvolution (boxes)
using the constrained optimization method. The constraint punishes negative
curvature. The statistical errors of the deconvolution are 1% to 7%, and the
deconvoluted network response is within 10% of the desired input.



Minimization of $\epsilon_{err}$ can be conveniently done by gradient descent. The
corresponding constrained deconvolution result is shown in Figure 5. The statistical
errors of the deconvolution are 1% to 7%, and the deconvoluted network response is
within 10% of the desired input. We see that the constraint term accurately corrects for
the distortions caused by the neural filter. It removes most of the oscillations of the
singular value decomposition method and reduces the error bars in the energy range above
20 GeV. It works remarkably well in the whole range from 4 GeV to 40 GeV. To
reduce the computation time, one can start with the values calculated by the singular
value decomposition method and then perform gradient descent to minimize $\epsilon_{err}$.
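A minimal sketch of such a constrained minimization is given below, under several assumptions that are not in the text: the curvature in the penalty is divided by a hypothetical `scale` so that the exponential stays numerically tame for spectra measured in raw counts, the step length is adapted by simple backtracking, and the descent starts from the measured spectrum rather than from a singular-value solution. The response matrix and exponential toy spectrum are likewise made up.

```python
import numpy as np

def objective(I, R, O, alpha, scale):
    """Return eps_err = chi^2 + C and its gradient, with sigma_k^2 ~ O_k."""
    with np.errstate(over="ignore", invalid="ignore"):
        resid = O - R @ I
        chi2 = 0.5 * np.sum(resid ** 2 / O)
        curv = I[:-2] - 2.0 * I[1:-1] + I[2:]      # discrete second derivative
        pen = alpha * np.exp(-curv / scale)        # large where curvature < 0
        grad = -R.T @ (resid / O)                  # gradient of chi^2
        g_curv = -pen / scale                      # d pen_i / d curv_i
        grad[:-2] += g_curv
        grad[1:-1] -= 2.0 * g_curv
        grad[2:] += g_curv
        return chi2 + pen.sum(), grad

def constrained_deconvolve(I0, R, O, alpha=1.0, scale=1.0, steps=500):
    """Gradient descent with backtracking: a step is kept only if eps_err drops."""
    I = np.array(I0, dtype=float)
    eps, grad = objective(I, R, O, alpha, scale)
    step = 1.0
    for _ in range(steps):
        trial = I - step * grad
        eps_trial, grad_trial = objective(trial, R, O, alpha, scale)
        if eps_trial < eps:                        # accept, then try a longer step
            I, eps, grad = trial, eps_trial, grad_trial
            step *= 1.5
        else:                                      # reject and shorten the step
            step *= 0.5
    return I, eps

# Toy data (hypothetical): Gaussian response, exponential spectrum, Poisson noise.
rng = np.random.default_rng(3)
centers = np.arange(4.5, 44.0, 1.0)
d = centers[:, None] - centers[None, :]
R = np.exp(-0.5 * (d / 2.0) ** 2)
R = 0.95 * R / R.sum(axis=0, keepdims=True)
I_true = 2000.0 * np.exp(-(centers - 4.5) / 8.0)
O = np.maximum(rng.poisson(R @ I_true), 1).astype(float)

I_start = O / 0.95                                 # crude starting guess
eps_start, _ = objective(I_start, R, O, alpha=1.0, scale=20.0)
I_fit, eps_fit = constrained_deconvolve(I_start, R, O, alpha=1.0, scale=20.0)
```

Plain gradient descent of this kind only guarantees a monotonic decrease of the objective; how closely `I_fit` approaches the input depends on `alpha`, `scale`, and the number of steps, all of which would need tuning in practice.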

4 Discussion 

The results above demonstrate that the neural filter deconvolution algorithm proposed
here can uncover the primordial jet spectrum in spite of the loss of information
in the transverse energy < 2 GeV region. However, it is also important to investigate
the robustness of the algorithm to changes in the jet distribution and fragmentation 
function. Recall that jet analysis was originally proposed as a probe of the parton 
energy loss in dense matter in nuclear collisions and that new physics would manifest 
itself in a characteristic change of the apparent jet distribution [1, 2]. 

4.1 Robustness to Softened Jet Spectrum 

The optimal weights in Table 1 are based on the calculated PQCD spectrum $I(E)$
of jets through minimizing Eq. (7). Since the main interest in performing jet studies
with nuclear collisions is to look for deformations which may arise due to energy loss
of the jet parton passing through dense matter [1, 2], we tested the response of the
network to changing $I(E) \to I(E + 4)$. This simulates a 4 GeV energy shift of jet
partons independent of their initial energy [2]. The result is shown in Figure 6.
It is clear that the constrained deconvolution method reproduces the input spectrum
well in both cases.

The reason for this is that the fragmentation function and the transverse-momentum
cutoff are the same, and the network parameters are most sensitive to those
two aspects. The network remains near optimal and the response function $R(E', E)$
is unaffected by this type of modification. We conclude that any shift of the jet
spectrum uncovered by the constrained deconvolution method reflects the underlying
physics and is not a spurious distortion caused by filtering out the low-frequency
noise. In the example studied, the method correctly uncovered the assumed 4 GeV
energy loss.

We note that in real applications, the network should be trained on-line with
actual pp jet data where the PQCD jet distribution is known to be correct from a
large body of prior experiments [3, 4]. With those data, the learning dynamics may
train the network to a different point in weight space to compensate for the actual
efficiencies of the detector, the influence of noise, and physical differences from the
LUND model. The cutoff parameters, $E_c$ and $R$, should also be determined so as to
optimize the overall jet finding efficiency.

nucl-th/9305004 9 May 93 
Figure 6: The robustness of the constrained deconvolution method is tested on
two input spectra. The solid curve is the original PQCD spectrum $I(E)$. The
dashed curve is an energy shifted spectrum $I(E + 4)$. The same fragmentation
function and weights are used in both cases. The output deconvolution points
reproduce the input well in both cases.



4.2 Modified Fragmentation 

A more challenging problem for the network is to expose it to jets that fragment
differently from those it was trained on. In the previous section we assumed that
energy loss in the medium only softens the hard parton spectrum before fragmentation
but the jet-fragmentation function for leading hadrons remains unaffected by the
nuclear medium. We now test the effect of modifying the fragmentation function
itself.

We explore next the possibility that the fragmentation function is modified by
medium effects so as to produce more hadrons along the jet axis with low energy
and fewer at high energy, as shown in Figure 7. To simulate "data" of this type we
changed the fragmentation parameter $a$ of the fragmentation probability distribution,

$$f(z) = z^{-1} (1 - z)^a e^{-b m_T^2 / z} , \tag{29}$$

in the LUND JETSET6.3 [15] code. In Figure 7, the hadron energy distributions for
a 10 GeV quark jet are shown for the default value $a = 0.5$ and two other values,
$a = 1.0$ and 2.5.
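The softening produced by increasing $a$ can be seen directly from the shape of $f(z)$. The sketch below evaluates the unnormalized function on a grid and compares the mean momentum fraction for the three values of $a$ quoted above. The values of `b` and `m_t` are illustrative assumptions (the text does not quote them), and this is only the single-step fragmentation function, not the full JETSET hadronization chain.

```python
import numpy as np

def lund_f(z, a, b=0.9, m_t=0.5):
    """Unnormalized f(z) = z^-1 (1 - z)^a exp(-b m_T^2 / z).

    b (GeV^-2) and m_t (GeV) are illustrative values, not taken from the text.
    """
    z = np.asarray(z, dtype=float)
    return (1.0 - z) ** a * np.exp(-b * m_t ** 2 / z) / z

z = np.linspace(1e-3, 1.0 - 1e-3, 2000)       # uniform grid in the momentum fraction

def mean_z(a):
    """Mean momentum fraction <z> under f(z), evaluated on the grid."""
    f = lund_f(z, a)
    return float(np.sum(z * f) / np.sum(f))

means = {a: mean_z(a) for a in (0.5, 1.0, 2.5)}   # <z> decreases as a grows
```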



Modified Fragmentation Distributions for 10 GeV Jets
Figure 7: The hadron fragmentation distributions from a 10 GeV quark jet are
shown for different fragmentation functions in which the parameter $a$ of LUND
JETSET6.3 is changed from 0.5 to 1.0 and to 2.5. For larger $a$, the fragmentation
becomes softer in the sense that more hadrons are produced at lower energy and
the high-energy hadrons are suppressed.



The ratio of the constrained deconvoluted network response to the unmodified
input PQCD spectrum $I(E)$ is shown in Figure 8. Note that the network parameters
were optimized for the default $a = 0.5$ fragmentation scheme. This ratio is seen to
decrease systematically with increasing $a$. As the relative number of low-energy
particles increases, the deconvoluted response is systematically lower than the actual
primordial input distribution. This systematic shift reflects well the change in the
underlying fragmentation physics and is again not an artifact of the filter. Therefore,
deviations from the initial PQCD spectrum after deconvolution can be used to search
for jet physics in AA that differs from that in pp.

In Figure 9 we show that this difference can also be analyzed in terms of an average 
energy shift parameter, similar to that discussed in the previous section. Denoting 
the deconvoluted spectrum by 1(E), we can define an effective energy shift, AE(E) 



The resulting AE for the a = 1.0 and 2.5 modified fragmentation schemes is shown 



via 



I(E + AE(Ej) = 1(E) 
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Deconvoluted Response to Modified Fragmentation 
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Figure 8: The ratios of the deconvoluted network response spectra to the input 
PQCD spectrum for different fragmentation functions (see Figure 7) are shown 
as a function of the jet energy. The network weights are trained with the frag- 
mentation function with a = 0.5. The deconvoluted spectrum for a = 0.5 is 
within 1% to 7% of the input spectrum. For a = 1.0 the ratio is 20% below 
unity and for a = 2.5 the deconvoluted spectrum is about 50% below the input. 
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Effective Energy Shift for Modified Fragmentation 
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Figure 9: An analysis of the results in Figure (8) in terms of the effective jet 
energy loss AE equating the output and input spectra: I(E + AE) = 1(E). The 
results show that medium modified fragmentation functions can be characterized 
well by a single energy loss over a wide range of jet energies. 
in Figure 9. Note that a constant $\Delta E \approx -0.5$ and $-2.0$ GeV, respectively, characterizes well the 
difference in the physics in these cases over most of the interesting energy range. We 
conclude that $\Delta E$ deduced in this way provides a convenient and physically suggestive 
measure of the nuclear dependence of jet physics in AA. 
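
The definition of the effective shift can be illustrated numerically. The following is only a minimal sketch, not the procedure used for Figure 9: it assumes a hypothetical power-law input spectrum with an illustrative exponent n = 5 (not taken from this paper), and the names `input_spectrum` and `effective_shift` are introduced here for illustration. For each energy it finds, by bisection, the shift that maps the input spectrum onto a given deconvoluted value. For a pure power law the result can be checked against the closed form dE = E [(deconvoluted/input)^(-1/n) - 1].

```python
def input_spectrum(energy, power=5.0):
    """Hypothetical falling power-law spectrum E**(-power); illustrative only."""
    return energy ** (-power)


def effective_shift(energy, deconvoluted_value, spectrum=input_spectrum,
                    max_shift=50.0, tol=1e-10):
    """Solve spectrum(energy + dE) = deconvoluted_value for dE by bisection.

    Assumes that `spectrum` decreases monotonically with energy and that the
    solution lies between -0.999 * energy and max_shift (both in GeV).
    """
    def mismatch(shift):
        return spectrum(energy + shift) - deconvoluted_value

    lo, hi = -0.999 * energy, max_shift
    if mismatch(lo) < 0 or mismatch(hi) > 0:
        raise ValueError("effective shift is not bracketed by the search range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0:
            lo = mid   # shifted input still too high: a larger shift is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Usage: a "deconvoluted" value built by shifting the input by 2 GeV
shift = effective_shift(10.0, input_spectrum(12.0))
```

In this toy case the recovered `shift` equals the 2 GeV that was put in, independent of the energy at which it is evaluated. That is the sense in which a single constant shift can summarize the difference between two spectra.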

Comparison of Different Filters 
Figure 10: The response curves for different filter weight configurations. The 
standard deviation and the bias of the network are plotted versus the input 
jet energy E in units of the cutoff energy $E_c = 2$ GeV. Three different network 
configurations are considered: the optimal neural filter, the linear high-pass filter 
with $w^k_{i \geq 1} = 1$, and a hybrid leading-two-particle filter with $w^k_{i \geq 3} = 0$. 



4.3 Comparison with Other Filters 

While the bias reveals a systematic variation with k, the approximate constancy 
of all the weights, $w^k_{i \geq 1} \approx 1$, indicates that the global minimum in weight space is 
close to the point defining a simple linear high-pass filter (LHPF), characterized by 
$w^k_{i \geq 1} = 1$ for k > 1. This is a non-trivial result of the optimization procedure. We 
therefore also compare with results obtained with the simplest LHPF network, in which only 
the biases $w^k_0$ are determined so as to minimize the global error. As a further test of 
the proximity of the global minimum to the LHPF point, we also performed a hybrid 
network analysis in which only the energies of the leading two particles are used to 
estimate the jet energy. In the hybrid net we set $w^k_{i \geq 3} = 0$ and determine the other 
weights as before. 
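
The difference between the LHPF and hybrid configurations can be made concrete with a short sketch. It rests on assumptions not confirmed by the text above: the network output is taken to be a bias plus a weighted sum of the energies of the particles above the cutoff, ordered with the leading particle first, and the names `filtered_sum` and `fit_bias` are hypothetical. For fixed weights, the bias that minimizes the mean squared error is the mean residual between the true jet energy and the weighted sum. In the hybrid analysis described above the two leading weights are trained, whereas here they are simply passed in.

```python
import numpy as np

E_CUT = 2.0  # filter cutoff in GeV, as quoted in the text


def filtered_sum(particle_energies, weights=None, e_cut=E_CUT):
    """Weighted sum of particle energies above the cutoff, leading particle first.

    weights=None means unit weights (linear high-pass filter). A shorter weight
    vector is zero-padded, so weights=[1, 1] keeps only the two leading particles.
    """
    e = np.sort(np.asarray(particle_energies, dtype=float))[::-1]
    e = e[e > e_cut]
    if weights is None:
        return float(e.sum())
    w = np.zeros(e.size)
    n = min(e.size, len(weights))
    w[:n] = np.asarray(weights, dtype=float)[:n]
    return float(np.dot(w, e))


def fit_bias(jet_energies, events, weights=None):
    """Least-squares bias for fixed weights: mean of (true energy - filtered sum)."""
    sums = np.array([filtered_sum(ev, weights) for ev in events])
    return float(np.mean(np.asarray(jet_energies, dtype=float) - sums))


# Usage with two made-up events (particle energies in GeV)
events = [[5.0, 3.0, 1.0, 0.5], [4.0, 2.5, 2.2, 1.0]]
jet_energies = [sum(ev) for ev in events]
lhpf_bias = fit_bias(jet_energies, events)                     # all weights 1
hybrid_bias = fit_bias(jet_energies, events, weights=[1, 1])   # two leading only
```

The jet-energy estimate for a new event is then `filtered_sum(event, weights) + bias`. The hybrid filter discards the third particle above the cutoff in the second event, so it needs a larger bias to compensate.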

The performance of all three networks is compared in Figure 10. Shown are the 
dispersion and bias of each network as a function of the initial transverse energy E 
of an isolated jet, in units of the filter cutoff momentum $E_c = 2$ GeV/c. We see 
that while the optimal neural filter has the best overall performance, the linear 
high-pass filter is only slightly worse. The hybrid two-particle filter performs 
considerably worse. We emphasize again that the convergence of the neural network 
to a point in weight space close to that defining a simple LHPF is not trivial and 
illustrates the power of the method. We could continue to guess different hybrid 
weight configurations. However, the learning algorithm explores the error surface 
and converges to the true global minimum in weight space without the need for 
guesses. For this particular problem with this particular fragmentation function, it 
just so happens that the minimum is not far from the high-pass filter point. Training 
the network with real pp data or more sophisticated event generators may lead to a 
different conclusion. 
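
The two quantities used in this comparison can be estimated from simulated jets as sketched below. The sketch assumes, since the text does not spell it out here, that the bias is the mean of the estimated minus the true jet energy in each bin of true energy and that the dispersion is the standard deviation of that difference. `response_stats` is a hypothetical helper, not part of any package used in this work.

```python
import numpy as np


def response_stats(true_energy, estimated_energy, bin_edges):
    """Per-bin bias (mean residual) and dispersion (standard deviation of residual).

    Jets are binned in their true energy; empty bins are returned as NaN.
    """
    true_energy = np.asarray(true_energy, dtype=float)
    residual = np.asarray(estimated_energy, dtype=float) - true_energy
    which = np.digitize(true_energy, bin_edges) - 1  # bin index of each jet
    n_bins = len(bin_edges) - 1
    bias = np.full(n_bins, np.nan)
    dispersion = np.full(n_bins, np.nan)
    for b in range(n_bins):
        r = residual[which == b]
        if r.size:
            bias[b] = r.mean()
            dispersion[b] = r.std()
    return bias, dispersion


# Usage with made-up numbers (GeV): two jets per bin
bias, dispersion = response_stats([5.0, 5.0, 15.0, 15.0],
                                  [4.0, 6.0, 14.0, 14.0],
                                  bin_edges=[0.0, 10.0, 20.0])
```

Dividing the bin centers and both returned arrays by the cutoff energy expresses the curves in units of the cutoff, as in Figure 10.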

5 Summary 

We have proposed a neural network filtering and deconvolution method for jet 
analysis to compensate for the loss of information in reactions where the background 
overwhelms the signal at low transverse energies. The numerical tests discussed here 
suggest that the method may be especially useful for application to nuclear collisions 
at RHIC and LHC energies, where a large number of minijets lead to an enormous 
background below $E_c \sim 2$-3 GeV. We showed that if jet physics is unmodified by the 
nuclear environment, then the filtering and deconvolution method accurately recovers 
the expected PQCD spectrum. We also tested the method in two different physical 
scenarios where the spectrum of leading hadrons is modified by the nuclear medium. 
In one scenario, the jet is assumed to lose an average energy $\Delta E$ before 
fragmenting as usual into the leading hadrons. We found that in this case the constrained 
deconvolution method accurately reproduces the shifted jet spectrum. In the second 
case, medium effects were assumed to lead to a softening of the jet-fragmentation 
function. That scenario also led to a systematic shift from the input PQCD spectrum. 
We then showed, however, that the shift could also be well described by an average 
energy loss. Our main conclusion is that in spite of the large background expected 
in AA collisions, which renders conventional jet analysis techniques useless, adaptive 
neurocomputation techniques can effectively overcome the loss of information at low 
transverse energies and help in the search for new physics. 

In closing, we point out several open problems that need further study in this 
connection. For simplicity, the present numerical study was limited to 
an isolated spectrum of quark jets with a threshold cutoff $E_c = 2$ GeV, in order to 
illustrate the method. We have not addressed the problem of differentiating between quark and 
gluon fragmentation [10], nor the rejection efficiency for coincident multi-jet events 
that happen by accident to fragment into the same angular cone R. The first problem 
can be addressed by training on "data" derived from more realistic event generators 
such as HIJING [2]. The second problem involves devising more efficient algorithms 
for calculating the relative rates of rare jets versus coincidental multiple jets. In 
principle, HIJING contains such backgrounds as well, but it is numerically impractical 
to study this at this time. A new method for triggering on coincident events would 
have to be implemented. Finally, the effects of finite resolution and detector biases 
should be investigated. Recovering information lost or distorted by the 
measurement process is a separate problem that requires coupling a full event generator 
such as HIJING to a GEANT analysis [6] of the detector response, possibly together 
with an adaptive tracking algorithm such as ET [14]. 